# Tutorial #5: Qualitative genome-scale modeling using public data, Flux Balance Analysis (FBA)Author: Marc Weber Date: March 2016 Affiliation: Center for Genomic Regulation (CRG), Barcelona, Spain. Notes: This tutorial was created for the course "Whole-cell modeling", taking place at CRG, Barcelona, April 3-8, 2016.Author: Marc Weber
Date: March 2016
Affiliation: Center for Genomic Regulation (CRG), Barcelona, Spain.
Notes: This tutorial was created for the course "Whole-cell modeling", taking place at CRG, Barcelona, April 3-8, 2016.
## Exercise 1: Flux Balance Analysis (FBA)In this exercise, we will learn to use the COBRA package to compute the optimal growth of E.coli under different conditions using flux balance analysis. We will use the full iJO1366 metabolic model of E.coli (Orth et al. 2011) that takes into accounts 1366 genes, 2251 metabolic reactions and 1136 unique metabolites. We will focus on the glucose central metabolism. The exercises are adapted from the tutorial "What is flux balance analysis?" by Orth, Thiele and Palsson (Orth et al. 2010).### Outline- Familiarizing with COBRA and Escher commands. - Get and download the iJO1366 from the BiGG database. - Loading iJO1366 model. - Drawing metabolic map. - Altering reaction bounds (adding and/or removing reactions). - Change the objective function (typically growth). - Solving for fluxes. - Draw metabolic map and fluxes using Escher in iPython. - Simulating optimal growth.- Examples: - Compute and draw fluxes for aerobic/anaerobic growth on glucose. - Growth on alternate substrates - compute growth and fluxes on succinate: aerobic / anaerobic (no solution) - compute growth and fluxes on puryvate: aerobic / anaerobic (no solution) - Simulating different objective function, maximize ATP production. - Simulating single gene deletion, gene essentiality - Simulating double gene deletions, synthetic lethalMore information:+ Orth, J. D., Thiele, I., & Palsson, B. Ø. (2010). What is flux balance analysis? Nature Biotechnology, 28(3), 245–248. http://doi.org/10.1038/nbt.1614+ Maarleveld, T. R., Khandelwal, R. a., Olivier, B. G., Teusink, B., & Bruggeman, F. J. (2013). Basic concepts and principles of stoichiometric modeling of metabolic networks. Biotechnology Journal, 8, 997–1008. http://doi.org/10.1002/biot.201200291+ Orth, J. D., Conrad, T. M., Na, J., Lerman, J. a, Nam, H., Feist, A. M., & Palsson, B. Ø. (2011). A comprehensive genome-scale reconstruction of Escherichia coli metabolism—2011. Molecular Systems Biology, 7(535), 1–9. http://doi.org/10.1038/msb.2011.65In this exercise, we will learn to use the COBRA package to compute the optimal growth of E.coli under different conditions using flux balance analysis. We will use the full iJO1366 metabolic model of E.coli (Orth et al. 2011) that takes into accounts 1366 genes, 2251 metabolic reactions and 1136 unique metabolites. We will focus on the glucose central metabolism. The exercises are adapted from the tutorial "What is flux balance analysis?" by Orth, Thiele and Palsson (Orth et al. 2010).
More information:
import escherimport escher.urlsimport cobraimport cobra.testimport jsonimport osimport pandasimport refrom IPython.display import HTMLimport seabornimport matplotlib.pyplot as plt%matplotlib inline# If you want to increase the default width of the notebook#HTML("<style>.container { width:100% !important; }</style>")# The default height of the metabolic maps as displayed in the notebookescherDefaultMapHeight = 800d = escher.urls.root_directoryprint('Escher directory: %s' % d)cobra.test.test_all()Escher is a web application for visualizing data on biological pathway maps.More information:- King, Z. A., Dräger, A., Ebrahim, A., Sonnenschein, N., Lewis, N. E., & Palsson, B. O. (2015). Escher: A Web Application for Building, Sharing, and Embedding Data-Rich Visualizations of Biological Pathways. PLOS Computational Biology, 11(8), e1004321. http://doi.org/10.1371/journal.pcbi.1004321- Documentation https://escher.readthedocs.org/en/latest/getting_started.htmlEscher is a web application for visualizing data on biological pathway maps.
More information:
List available metabolic maps in the Escher package:List available metabolic maps in the Escher package:
escher.list_available_maps()Draw the metabolic map of the textbook example of E. coli central carbon metabolism:Draw the metabolic map of the textbook example of E. coli central carbon metabolism:
metabolicMap = escher.Builder(map_name='e_coli_core.Core metabolism')metabolicMap.display_in_notebook(height=escherDefaultMapHeight)Draw the map covering the central metabolism part of the full E. coli model iJO1366:Draw the map covering the central metabolism part of the full E. coli model iJO1366:
metabolicMap = escher.Builder(map_name='iJO1366.Central metabolism')metabolicMap.display_in_notebook(height=escherDefaultMapHeight)Browse the BiGG database (http://bigg.ucsd.edu/) and download the iJO1366 metabolic model of E. coli in JSON format. Save the file in the same directory as this ipython notebook.More information:+ iJO1366 metabolic model article: Orth, J. D., Conrad, T. M., Na, J., Lerman, J. a, Nam, H., Feist, A. M., & Palsson, B. Ø. (2011). A comprehensive genome-scale reconstruction of Escherichia coli metabolism—2011. Molecular Systems Biology, 7(535), 1–9. http://doi.org/10.1038/msb.2011.65Browse the BiGG database (http://bigg.ucsd.edu/) and download the iJO1366 metabolic model of E. coli in JSON format. Save the file in the same directory as this ipython notebook.
More information:
More information:+ COBRApy documentation https://cobrapy.readthedocs.org/en/stable/getting_started.htmlMore information:
Load the iJO1366 model with the COBRApy package.Load the iJO1366 model with the COBRApy package.
model = cobra.io.load_json_model('iJO1366.json')The model contains the lists of reactions, metabolites, and genes, which are attributes of the cobrapy model.The model contains the lists of reactions, metabolites, and genes, which are attributes of the cobrapy model.
print('Nb of reactions: ',len(model.reactions))print('Nb of metabolites: ',len(model.metabolites))print('Nb of genes: ',len(model.genes))Just like a regular list, objects in the reactions, metabolites and genes lists can be retrieved by index. For example, to get the 30th reaction in the model,Just like a regular list, objects in the reactions, metabolites and genes lists can be retrieved by index. For example, to get the 30th reaction in the model,
model.reactions[29]Additionally, items can be retrieved by their id using the get_by_id() function. For example, consider the reaction glucose 6-phosphate isomerase, which interconverts glucose 6-phosphate and fructose 6-phosphate. The reaction id for this reaction in our model is PGI.Additionally, items can be retrieved by their id using the get_by_id() function. For example, consider the reaction glucose 6-phosphate isomerase, which interconverts glucose 6-phosphate and fructose 6-phosphate. The reaction id for this reaction in our model is PGI.
pgi = model.reactions.get_by_id("PGI")With an interactive shell such as IPython, reactions can also be retrieved using tab-completion to list elements inside a list. While this is not recommended behavior for most code because of the possibility for characters like “-” inside ids, this is very useful while in an interactive prompt. Start typing `model.reactions.`, then the first characters of the reaction name, and hit tab to get a list of reactions.With
an interactive shell such as IPython, reactions can also be retrieved
using tab-completion to list elements inside a list. While this is not
recommended behavior for most code because of the possibility for
characters like “-” inside ids, this is very useful while in an
interactive prompt. Start typing model.reactions., then the first characters of the reaction name, and hit tab to get a list of reactions.
pgi = model.reactions.PGIWe can view the full name, reaction equation and metabolites with stochiometry as strings.We can view the full name, reaction equation and metabolites with stochiometry as strings.
print(pgi.name)print(pgi.reaction)print(pgi.metabolites)We can also view reaction upper and lower bounds. Because the pgi.lower_bound < 0, and pgi.upper_bound > 0, pgi is reversibleWe can also view reaction upper and lower bounds. Because the pgi.lower_bound < 0, and pgi.upper_bound > 0, pgi is reversible
print(pgi.lower_bound, "< pgi <", pgi.upper_bound)print(pgi.reversibility)We define a function to print out the most relevant information about a gene in a cobra model.We define a function to print out the most relevant information about a gene in a cobra model.
def print_gene_info(gene): print("cobra_id: ",gene.id) print("name: ",gene.name) print("associated reactions:") for reac in gene.reactions: print(reac.id, ', ', reac.name) print()Try the print function with one gene of the model.Try the print function with one gene of the model.
print_gene_info(model.genes.b0841)solution = model.optimize()Simulations using flux balance analysis can be solved using Model.optimize(). This will maximize or minimize (maximizing is the default) flux through the objective reactions. The Model.optimize() function will return a Solution object, which will also be stored at model.solution. A solution object has several attributes:+ f: the objective value+ status: the status from the linear programming solver+ x_dict: a dictionary of {reaction_id: flux_value} (also called "primal")+ x: a list for x_dict+ y_dict: a dictionary of {metabolite_id: dual_value}.+ y: a list for y_dictFor example, after the last call to model.optimize(), the status should be 'optimal' if the solver returned no errors, and f should be the objective valueSimulations using flux balance analysis can be solved using Model.optimize(). This will maximize or minimize (maximizing is the default) flux through the objective reactions. The Model.optimize() function will return a Solution object, which will also be stored at model.solution. A solution object has several attributes:
For example, after the last call to model.optimize(), the status should be 'optimal' if the solver returned no errors, and f should be the objective value
print('Satus: ',solution.status)print('Growth rate: %.2f' % solution.f)### Maximum aerobic growth rateWe want to calculate the maximum aerobic growth of E. coli under the assumption that uptake of glucose, and not oxygen, is the limiting constraint on growth (aerobic conditions).We want to calculate the maximum aerobic growth of E. coli under the assumption that uptake of glucose, and not oxygen, is the limiting constraint on growth (aerobic conditions).
First we set the maximum glucose uptake rate to 18.5 mmol gDW<sup>-1</sup> h<sup>-1</sup> (millimoles per gram dry cell weight per hour, the default flux units used in the COBRA package). Knowing that the glucose exchange reaction id is `EX_glc__D_e`, print the reactions name, reaction equation, list of metabolites and lower and upper bounds. Then, set the new value of the lower bound to -18.5.By convention, exchange reactions are written as export reactions (e.g. `glc__D_e <==>`), so import of a metabolite is a negative flux. Maximum uptake rates are therefore expressed as the negative lower bounds of the corresponding exchange reaction with units [mmol gDW<sup>-1</sup> h<sup>-1</sup>].First we set the maximum glucose uptake rate to 18.5 mmol gDW-1 h-1
(millimoles per gram dry cell weight per hour, the default flux units
used in the COBRA package). Knowing that the glucose exchange reaction
id is EX_glc__D_e, print the reactions name, reaction
equation, list of metabolites and lower and upper bounds. Then, set the
new value of the lower bound to -18.5.
By convention, exchange reactions are written as export reactions (e.g. glc__D_e <==>),
so import of a metabolite is a negative flux. Maximum uptake rates are
therefore expressed as the negative lower bounds of the corresponding
exchange reaction with units [mmol gDW-1 h-1].
glc_exchange = model.reactions.get_by_id('EX_glc__D_e')print(glc_exchange.name)print(glc_exchange.reaction)print(glc_exchange.metabolites)print(glc_exchange.lower_bound, "< EX_glc__D_e <", glc_exchange.upper_bound)glc_exchange.lower_bound = -18.5print(glc_exchange.lower_bound, "< EX_glc__D_e <", glc_exchange.upper_bound)Second we allow unlimited oxygen uptake. Search the id of the oxygen (O2) uptake reaction of the iJO1366 model in the BiGG database. Change the lower bound of the oxygen uptake reaction to -1000. By setting the lower bound of the oxygen uptake reaction to such a large negative number, it is practically unbounded.Second we allow unlimited oxygen uptake. Search the id of the oxygen (O2) uptake reaction of the iJO1366 model in the BiGG database. Change the lower bound of the oxygen uptake reaction to -1000. By setting the lower bound of the oxygen uptake reaction to such a large negative number, it is practically unbounded.
o2_exchange = model.reactions.get_by_id('EX_o2_e')print(o2_exchange.name)print(o2_exchange.reaction)print(o2_exchange.metabolites)print(o2_exchange.lower_bound, "< EX_o2_e <", o2_exchange.upper_bound)o2_exchange.lower_bound = -1000print(o2_exchange.lower_bound, "< EX_o2_e <", o2_exchange.upper_bound)The objective function is determined from the objective_coefficient attribute of the objective reaction(s). In the iJO1366 model, there are two possible objective reactions, the “core” and “wild-type” biomass reactions. These are reactions that drain biomass precursor compounds in experimentally determined ratios to simulate growth. The “wild- type” biomass reaction contains the precursors to all the typical wild-type cellular components of E. coli, while the “core” biomass reaction contains the precursors only to essential components.We can find the id's of these reactions by searching for the string "biomass" in the reaction names.The objective function is determined from the objective_coefficient attribute of the objective reaction(s). In the iJO1366 model, there are two possible objective reactions, the “core” and “wild-type” biomass reactions. These are reactions that drain biomass precursor compounds in experimentally determined ratios to simulate growth. The “wild- type” biomass reaction contains the precursors to all the typical wild-type cellular components of E. coli, while the “core” biomass reaction contains the precursors only to essential components.
We can find the id's of these reactions by searching for the string "biomass" in the reaction names.
for reac in model.reactions: if re.search('biomass', reac.name, re.I): print(reac.name, ', id:', reac.id)Next, we set the objective function as the core biomass reaction, which contains the precursors only to essential components. The objective function can be set to a specific reaction by using the `model.change_objective` function.Next,
we set the objective function as the core biomass reaction, which
contains the precursors only to essential components. The objective
function can be set to a specific reaction by using the model.change_objective function.
model.change_objective('BIOMASS_Ec_iJO1366_core_53p95M')print(model.reactions.BIOMASS_Ec_iJO1366_WT_53p95M.reaction)Perform FBA with maximization of the biomass reaction as the objective.Perform FBA with maximization of the biomass reaction as the objective.
solution = model.optimize()Growth rate of the optimal solution.Growth rate of the optimal solution.
print('Aerobic conditions.')print('Satus: ',solution.status)print('Growth rate: %.2f' % solution.f)The optimal fluxes accross all reactions can be inspected by printing the attribute x_dict.The optimal fluxes accross all reactions can be inspected by printing the attribute x_dict.
solution.x_dictDraw the fluxes on the metabolic map. We observe that there is high flux in the glycolysis, pentose phosphate, TCA cycle, and oxidative phosphorylation pathways,Draw the fluxes on the metabolic map. We observe that there is high flux in the glycolysis, pentose phosphate, TCA cycle, and oxidative phosphorylation pathways,
metabolicMap = escher.Builder(map_name='iJO1366.Central metabolism', reaction_data=solution.x_dict, # color and size according to the absolute value reaction_styles=['color', 'size', 'abs', 'text'], # change the default colors reaction_scale=[{'type': 'min', 'color': '#cccccc', 'size': 4}, {'type': 'mean', 'color': '#0000dd', 'size': 20}, {'type': 'max', 'color': '#ff0000', 'size': 40}], # only show the primary metabolites hide_secondary_metabolites=True)metabolicMap.display_in_notebook()### Maximum anaerobic growth rateIn order to simulate anaerobic conditions, we set the lower bound of the oxygen uptake reaction to zero, such that no oxygen can enter the system.In order to simulate anaerobic conditions, we set the lower bound of the oxygen uptake reaction to zero, such that no oxygen can enter the system.
print(o2_exchange.lower_bound, "< EX_o2_e <", o2_exchange.upper_bound)o2_exchange.lower_bound = 0print(o2_exchange.lower_bound, "< EX_o2_e <", o2_exchange.upper_bound)Perform FBA with maximization of the biomass reaction as the objective.Perform FBA with maximization of the biomass reaction as the objective.
solution = model.optimize()print('Anaerobic conditions.')print('Satus: ',solution.status)print('Growth rate: %.2f' % solution.f)When `model.optimize()` is used as before, the resulting growth rate is now much lower, 0.48 hr^-1.When model.optimize() is used as before, the resulting growth rate is now much lower, 0.48 hr^-1.
Draw the fluxes on the metabolic map. The flux distribution shows that oxidative phosphorylation is not used in these conditions, and that acetate, formate, and ethanol are produced by fermentation pathwaysDraw the fluxes on the metabolic map. The flux distribution shows that oxidative phosphorylation is not used in these conditions, and that acetate, formate, and ethanol are produced by fermentation pathways
metabolicMap.reaction_data=solution.x_dictmetabolicMap = escher.Builder(map_name='iJO1366.Central metabolism', reaction_data=solution.x_dict, # color and size according to the absolute value reaction_styles=['color', 'size', 'abs', 'text'], # change the default colors reaction_scale=[{'type': 'min', 'color': '#cccccc', 'size': 4}, {'type': 'mean', 'color': '#0000dd', 'size': 20}, {'type': 'max', 'color': '#ff0000', 'size': 40}], # only show the primary metabolites hide_secondary_metabolites=True)metabolicMap.display_in_notebook()### Growth on alternate substratesJust as FBA was used to calculate growth rates of E. coli on glucose, it can also be used to simulate growth on other substrates. The iJO1366 computational model contains exchange reactions for 324 different compounds, 285 of these compounds contain at least one carbon atom. It is therefore possible to use iJO1366 to predict the growth capabilities of E. coli on a very wide range of media conditions.As an example, we will simulate growth on succinate instead of glucose.Just as FBA was used to calculate growth rates of E. coli on glucose, it can also be used to simulate growth on other substrates. The iJO1366 computational model contains exchange reactions for 324 different compounds, 285 of these compounds contain at least one carbon atom. It is therefore possible to use iJO1366 to predict the growth capabilities of E. coli on a very wide range of media conditions.
As an example, we will simulate growth on succinate instead of glucose.
List all the exchange reactions. Iterate over the reactions and search for the string 'EX' in the reaction id (use `re.search` for string comparison). If the reaction id indicates an exchange reaction, print out the reaction name and id. Print also the total number of exchange reactions.List all the exchange reactions. Iterate over the reactions and search for the string 'EX' in the reaction id (use re.search
for string comparison). If the reaction id indicates an exchange
reaction, print out the reaction name and id. Print also the total
number of exchange reactions.
i = 0for reac in model.reactions: if re.search(r'^EX', reac.id): print(reac.name, ', id:', reac.id) i += 1 print('Nb of exchange reactions: ',i)Search for the succinate exchange reaction id in the list of reactions. Iterate over the reactions and use string comparison `re.search` to search for the reaction name.Search for the succinate exchange reaction id in the list of reactions. Iterate over the reactions and use string comparison re.search to search for the reaction name.
for reac in model.reactions: if re.search(r'succinate', reac.name, re.I) and re.search(r'exchange', reac.name, re.I): print(reac.name, ', id:', reac.id)Set the following conditions: aerobic growth (illimited o2 uptake), no glucose and succinate (maximum uptake rate: 20). Then, compute the growth rate and plot the fluxes on the metabolic map.Set the following conditions: aerobic growth (illimited o2 uptake), no glucose and succinate (maximum uptake rate: 20). Then, compute the growth rate and plot the fluxes on the metabolic map.
# aerobic conditionso2_exchange.lower_bound = -1000print(o2_exchange.lower_bound, "< EX_o2_e <", o2_exchange.upper_bound)# No glucoseglc_exchange.lower_bound = 0print(glc_exchange.lower_bound, "< EX_glc__D_e <", glc_exchange.upper_bound)# Succinate uptakesucc_exchange = model.reactions.get_by_id('EX_succ_e')succ_exchange.lower_bound = -20print(succ_exchange.lower_bound, "< EX_succ_e <", succ_exchange.upper_bound)solution = model.optimize()print('Growth rate: %.2f' % solution.f)# Set back to default valuessucc_exchange.lower_bound = 0glc_exchange.lower_bound = -18.5metabolicMap.reaction_data=solution.x_dictmetabolicMap = escher.Builder(map_name='iJO1366.Central metabolism', reaction_data=solution.x_dict, # color and size according to the absolute value reaction_styles=['color', 'size', 'abs', 'text'], # change the default colors reaction_scale=[{'type': 'min', 'color': '#cccccc', 'size': 4}, {'type': 'mean', 'color': '#0000dd', 'size': 20}, {'type': 'max', 'color': '#ff0000', 'size': 40}], # only show the primary metabolites hide_secondary_metabolites=True)metabolicMap.display_in_notebook()Similarly, we compute the growth rate on succinate under anaerobic conditions.First, we set the maximum o2 uptake rate to 0.Similarly, we compute the growth rate on succinate under anaerobic conditions.
First, we set the maximum o2 uptake rate to 0.
# anaerobic conditionso2_exchange.lower_bound = 0print(o2_exchange.lower_bound, "< EX_o2_e <", o2_exchange.upper_bound)# No glucoseglc_exchange.lower_bound = 0# Succinate uptakesucc_exchange.lower_bound = -20Then, we compute the optimal solution and print the status from the solver.Then, we compute the optimal solution and print the status from the solver.
solution = model.optimize()print('Satus: ',solution.status)# Set back to default valuessucc_exchange.lower_bound = 0glc_exchange.lower_bound = -18.5In this case, the optimal solution does not exists and the linear programming solver of the COBRA package returns the status "infeasible". FBA predicts that growth is not possible on succinate under anaerobic conditions, because the maximum amount of ATP that can be produced from this amount of succinate is less than the minimum bound of the non-growth associated maintenance (NGAM) reaction.Growth-associated maintenance (GAM) and non-growth-associated maintenance (NGAM) are the amounts of ATP consumed during cell growth and by non-growth associated processes such as maintenance of membrane gradients, respectively. GAM is a component of the biomass reaction, while NGAM is manifest as a lower bound on the separate ATP draining reaction “ATPM.”<img src="Images/Thiele2010_growth_associated_maintenance_plot.png" width="400" />_Thiele, I., & Palsson, B. Ø. (2010). A protocol for generating a high-quality genome-scale metabolic reconstruction. Nature Protocols, 5(1), 93–121._Print the lower bound of the ATPM reaction in the iJO1366 model, reaction id `ATPM`.In this case, the optimal solution does not exists and the linear programming solver of the COBRA package returns the status "infeasible". FBA predicts that growth is not possible on succinate under anaerobic conditions, because the maximum amount of ATP that can be produced from this amount of succinate is less than the minimum bound of the non-growth associated maintenance (NGAM) reaction.
Growth-associated maintenance (GAM) and non-growth-associated maintenance (NGAM) are the amounts of ATP consumed during cell growth and by non-growth associated processes such as maintenance of membrane gradients, respectively. GAM is a component of the biomass reaction, while NGAM is manifest as a lower bound on the separate ATP draining reaction “ATPM.”
Thiele, I., & Palsson, B. Ø. (2010). A protocol for generating a
high-quality genome-scale metabolic reconstruction. Nature Protocols,
5(1), 93–121.
Print the lower bound of the ATPM reaction in the iJO1366 model, reaction id ATPM.
model.reactions.ATPM.lower_boundThe minimum amount of ATP molecules consumed for non-growth maintenance is 3.15 mmol gDW<sup>-1</sup> h<sup>-1</sup>.The minimum amount of ATP molecules consumed for non-growth maintenance is 3.15 mmol gDW-1 h-1.
Repeat the same steps to compute the growth rate under aerobic and anaerobic conditions on pyruvate.Hint: to search for the pyruvate exchange reaction, use IPython list tab completion on the `model.reactions` object, knowing that all exchange reactions start with "EX_".Repeat the same steps to compute the growth rate under aerobic and anaerobic conditions on pyruvate.
Hint: to search for the pyruvate exchange reaction, use IPython list tab completion on the model.reactions object, knowing that all exchange reactions start with "EX_".
# aerobic conditionso2_exchange.lower_bound = -1000print(o2_exchange.lower_bound, "< EX_o2_e <", o2_exchange.upper_bound)# No glucoseglc_exchange.lower_bound = 0print(glc_exchange.lower_bound, "< EX_glc__D_e <", glc_exchange.upper_bound)# Pyruvate uptakepyruvate_exchange = model.reactions.EX_pyr_epyruvate_exchange.lower_bound = -20print(pyruvate_exchange.lower_bound, "< EX_pyr_e <", pyruvate_exchange.upper_bound)solution = model.optimize()print('Growth rate: %.2f' % solution.f)metabolicMap.reaction_data=solution.x_dictmetabolicMap = escher.Builder(map_name='iJO1366.Central metabolism', reaction_data=solution.x_dict, # color and size according to the absolute value reaction_styles=['color', 'size', 'abs', 'text'], # change the default colors reaction_scale=[{'type': 'min', 'color': '#cccccc', 'size': 4}, {'type': 'mean', 'color': '#0000dd', 'size': 20}, {'type': 'max', 'color': '#ff0000', 'size': 40}], # only show the primary metabolites hide_secondary_metabolites=True)metabolicMap.display_in_notebook()# anaerobic conditionso2_exchange.lower_bound = 0print(o2_exchange.lower_bound, "< EX_o2_e <", o2_exchange.upper_bound)# No glucoseglc_exchange.lower_bound = 0# Pyruvate uptakepyruvate_exchange.lower_bound = -20solution = model.optimize()print('Growth rate: %.2f' % solution.f)# Set back to default valuesglc_exchange.lower_bound = -18.5succ_exchange.lower_bound = 0pyruvate_exchange.lower_bound = 0metabolicMap.reaction_data=solution.x_dictmetabolicMap = escher.Builder(map_name='iJO1366.Central metabolism', reaction_data=solution.x_dict, # color and size according to the absolute value reaction_styles=['color', 'size', 'abs', 'text'], # change the default colors reaction_scale=[{'type': 'min', 'color': '#cccccc', 'size': 4}, {'type': 'mean', 'color': '#0000dd', 'size': 20}, {'type': 'max', 'color': '#ff0000', 'size': 40}], # only show the primary metabolites hide_secondary_metabolites=True)metabolicMap.display_in_notebook()### Simulating different objective function: maximize ATP production.FBA can also be used to determine the maximum yields of important cofactors and biosynthetic precursors from glucose and other substrates.In this example, we will compute the maximum yields of the cofactors ATP, NADH, and NADPH from glucose under aerobic conditions. To calculate the optimal ATP production, first constrain the glucose exchange reaction to exactly -1 mmol gDW<sup>-1</sup> h<sup>-1</sup> by setting both the lower and upper bounds to -1.FBA can also be used to determine the maximum yields of important cofactors and biosynthetic precursors from glucose and other substrates.
In this example, we will compute the maximum yields of the cofactors ATP, NADH, and NADPH from glucose under aerobic conditions. To calculate the optimal ATP production, first constrain the glucose exchange reaction to exactly -1 mmol gDW-1 h-1 by setting both the lower and upper bounds to -1.
# aerobic conditionso2_exchange.lower_bound = -1000# No other carbon sourcessucc_exchange.lower_bound = 0pyruvate_exchange.lower_bound = 0# Constrain glucose uptakeglc_exchange.lower_bound = -1glc_exchange.upper_bound = -1print(glc_exchange.lower_bound, "< EX_glc__D_e <", glc_exchange.upper_bound)We can check the maximum rate for all the uptake reactions by listing them. Iterate over the reactions and search for reaction that contain exchange in their name and with a non-zero lower bound. Print the reaction bounds, reaction name and id.We can check the maximum rate for all the uptake reactions by listing them. Iterate over the reactions and search for reaction that contain exchange in their name and with a non-zero lower bound. Print the reaction bounds, reaction name and id.
# Print all the uptake reactions with non-zero lower boundfor reac in model.reactions: if re.search(r'exchange', reac.name, re.I) and reac.lower_bound < 0: print('{: 8.1f} < v < {: 8.1f}'.format(reac.lower_bound, reac.upper_bound),', ', reac.name, ', id:', reac.id)Next, set the ATP maintenance reaction (id `ATPM`) as the objective to be maximized using the `change_objective` function. ATPM is a stoichiometrically balanced reaction that hydrolyzes ATP (`atp_c`) and produces ADP (`adp_c`), inorganic phosphate (`pi_c`), and a proton (`h_c`). It works as an objective for maximizing ATP production because in order to consume the maximum amount of ATP, the network must first produce ATP by the most efficient pathways available by recharging the produced ADP.Next, set the ATP maintenance reaction (id ATPM) as the objective to be maximized using the change_objective function. ATPM is a stoichiometrically balanced reaction that hydrolyzes ATP (atp_c) and produces ADP (adp_c), inorganic phosphate (pi_c), and a proton (h_c).
It works as an objective for maximizing ATP production because in order
to consume the maximum amount of ATP, the network must first produce
ATP by the most efficient pathways available by recharging the produced
ADP.
model.change_objective('ATPM')model.reactions.ATPM.reactionAs we have seen above (anaerobic growth on succinate), by default this reaction has a lower bound of 3.15 mmol gDW<sup>-1</sup> h<sup>-1</sup> to simulate non-growth associated maintenance costs. Remove the constraint on this reaction by setting the lower bounds to 0 and the upper bound to 1000.As we have seen above (anaerobic growth on succinate), by default this reaction has a lower bound of 3.15 mmol gDW-1 h-1 to simulate non-growth associated maintenance costs. Remove the constraint on this reaction by setting the lower bounds to 0 and the upper bound to 1000.
model.reactions.ATPM.lower_bound = 0model.reactions.ATPM.upper_bound = 1000Calculate the optimal solution which should give the maximum yield of ATP. Now the optimal objective value is not the biomass production anymore, but rather the flux through the ATP production reaction in units of mmol gDW<sup>-1</sup> h<sup>-1</sup>. Because the uptake rate of glucose has been set to exactly -1 mmol gDW<sup>-1</sup> h<sup>-1</sup>, we can obtain easily the ratio of ATP molecules produced per glucose molecules consumed. The result should be 23.5 mol ATP/mol glucose.Calculate the optimal solution which should give the maximum yield of ATP. Now the optimal objective value is not the biomass production anymore, but rather the flux through the ATP production reaction in units of mmol gDW-1 h-1. Because the uptake rate of glucose has been set to exactly -1 mmol gDW-1 h-1, we can obtain easily the ratio of ATP molecules produced per glucose molecules consumed. The result should be 23.5 mol ATP/mol glucose.
solution = model.optimize()print('Maximum ATP production rate:', solution.f, 'mol ATP/mol glucose')metabolicMap = escher.Builder(map_name='iJO1366.Central metabolism', reaction_data=solution.x_dict, # color and size according to the absolute value reaction_styles=['color', 'size', 'abs', 'text'], # change the default colors reaction_scale=[{'type': 'min', 'color': '#cccccc', 'size': 4}, {'type': 'mean', 'color': '#0000dd', 'size': 20}, {'type': 'max', 'color': '#ff0000', 'size': 40}], # only show the primary metabolites hide_secondary_metabolites=True)metabolicMap.display_in_notebook()The sensitivity of an FBA solution is indicated by either shadow prices or reduced costs. Shadow prices are the derivative of the objective function with respect to the exchange flux of a metabolite. They indicate how much the addition of that metabolite will increase or decrease the objective. Shadow prices give us an indication of the growth-limiting compounds in the medium.The reduced cost is a quantity similar to the shadow price, but associated with reactions rather than metabolites. Reduced costs are the derivatives of the objective function with respect to an internal reaction, indicating how much each particular reaction affects the objective.In the COBRApy package, after computing an optimal solution for a model, shadow prices can be accessed for each metabolite through the y attribute (`metabolite.y`) (see [COBRApy documentation](http://cobrapy.readthedocs.org/en/latest/cobra.core.html#module-cobra.core.Metabolite)). Note that the sign of the `y` attribute should be changed to get the standard interpretation of a positive shadow price: an increase in metabolite uptake (relaxing constraint) leads to a positive increase in the objective function (e.g. biomass production).More information:+ Price, N. D., Reed, J. L., Palsson, B. Ø., & Correspondence, B. Ø. P. (2004). Genome-Scale Models of Microbial Cells : Evaluating the Consequences of Constraints, 2(November), 886–897. http://doi.org/10.1038/nrmicro1023+ Goffin, P., van de Bunt, B., Giovane, M., Leveau, J. H. J., Höppener-Ogawa, S., Teusink, B., & Hugenholtz, J. (2010). Understanding the physiology of Lactobacillus plantarum at zero growth. Molecular Systems Biology, 6(413), 413. http://doi.org/10.1038/msb.2010.67+ Teusink, B., Wiersma, A., Molenaar, D., Francke, C., De Vos, W. M., Siezen, R. J., & Smid, E. J. (2006). Analysis of growth of Lactobacillus plantarum WCFS1 on a complex medium using a genome-scale metabolic model. Journal of Biological Chemistry, 281(52), 40041–40048. http://doi.org/10.1074/jbc.M606263200The sensitivity of an FBA solution is indicated by either shadow prices or reduced costs. Shadow prices are the derivative of the objective function with respect to the exchange flux of a metabolite. They indicate how much the addition of that metabolite will increase or decrease the objective. Shadow prices give us an indication of the growth-limiting compounds in the medium.
The reduced cost is a quantity similar to the shadow price, but associated with reactions rather than metabolites. Reduced costs are the derivatives of the objective function with respect to an internal reaction, indicating how much each particular reaction affects the objective.
In the COBRApy package, after computing an optimal solution for a
model, shadow prices can be accessed for each metabolite through the y
attribute (metabolite.y) (see COBRApy documentation). Note that the sign of the y
attribute should be changed to get the standard interpretation of a
positive shadow price: an increase in metabolite uptake (relaxing
constraint) leads to a positive increase in the objective function (e.g.
biomass production).
More information:
Compute the shadow price of cytosolic proton `h_c` and of glucose `glc__D_c`.Compute the shadow price of cytosolic proton h_c and of glucose glc__D_c.
model.metabolites.h_c.ymodel.metabolites.glc__D_c.yAs expected, the shadow price of glucose is 23.5 mol ATP/ mol glucose, the same value as we have found above.ATP production is also limited by cellular proton balancing. The shadow price of cytosolic protons (`h_c`) is -0.25, indicating that the increase of 4 mol protons/mol glucose to the system decreases ATP yield by 1 mol ATP/mol glucose. Protons are produced by various metabolic reactions and are also pumped into the cell by the ATP synthase reaction. In order for the system to be at steady-state, an equal number of protons must be pumped out by the electron transport chain reactions or by excreting metabolites with symporters. If more ATP were to be produced by ATP synthase, it would import additional protons that have no way to escape the cell.As expected, the shadow price of glucose is 23.5 mol ATP/ mol glucose, the same value as we have found above.
ATP production is also limited by cellular proton balancing. The shadow price of cytosolic protons (h_c)
is -0.25, indicating that the increase of 4 mol protons/mol glucose to
the system decreases ATP yield by 1 mol ATP/mol glucose. Protons are
produced by various metabolic reactions and are also pumped into the
cell by the ATP synthase reaction. In order for the system to be at
steady-state, an equal number of protons must be pumped out by the
electron transport chain reactions or by excreting metabolites with
symporters. If more ATP were to be produced by ATP synthase, it would
import additional protons that have no way to escape the cell.
Calculation of NADH and NADPH one at a time can be performed in a similar manner. First, constrain `ATPM` to 0 mmol gDW<sup>-1</sup> h<sup>-1</sup> flux so the cell is not required to produce ATP, and also cannot consume any ATP using this reaction.Calculation of NADH and NADPH one at a time can be performed in a similar manner. First, constrain ATPM to 0 mmol gDW-1 h-1 flux so the cell is not required to produce ATP, and also cannot consume any ATP using this reaction.
# Constrain glucose uptakeglc_exchange.lower_bound = -1glc_exchange.upper_bound = -1# Constrain ATPM reaction fluxmodel.reactions.ATPM.lower_bound = 0model.reactions.ATPM.upper_bound = 0Add stoichiometrically balanced NADH consuming reactions using the function `model.add_reaction`.Any arbitrary reaction can be defined in the COBRA package. The easiest way to add a new reaction to an existing model is the following:+ create a `cobra.Reaction` object and set its id to 'NADH_consume' and set its name attribute.+ add the reaction object to the model using the `model.add_reaction` function+ define the reaction equation and metabolites by using the `reaction.build_reaction_from_string` function. A simple string containing the reaction equation using the correct metabolites id's is enough to define the metabolites and their stochiometric coefficients.+ print the newly defined reaction equationAdd stoichiometrically balanced NADH consuming reactions using the function model.add_reaction.
Any arbitrary reaction can be defined in the COBRA package. The easiest way to add a new reaction to an existing model is the following:
cobra.Reaction object and set its id to 'NADH_consume' and set its name attribute.model.add_reaction functionreaction.build_reaction_from_string
function. A simple string containing the reaction equation using the
correct metabolites id's is enough to define the metabolites and their
stochiometric coefficients.# Create the reaction objectreaction = cobra.Reaction('NADH_consume')reaction.name = 'NADH consuming reaction'# Add the reaction object to the modelmodel.add_reaction(reaction)# Define the reaction equation and metabolites using a string. Example met1_c + met2_c -> met3_creaction.build_reaction_from_string('nadh_c -> nad_c + h_c', verbose=True)print(model.reactions.NADH_consume.reaction) Set the NADH consuming reaction as the objective using `model.change_objective`. Set the NADH consuming reaction as the objective using model.change_objective.
model.change_objective('NADH_consume')Calculate the optimal solution which should give the maximum yield of NADH. Now the optimal objective value is the flux through the NADH consuming reaction in units of mmol gDW<sup>-1</sup> h<sup>-1</sup>. Because the uptake rate of glucose has been set to exactly -1 mmol gDW<sup>-1</sup> h<sup>-1</sup>, we can obtain easily the ratio of NADH molecules produced per glucose molecules consumed. The result should be 10.29 mol NADH/mol glucose.Calculate the optimal solution which should give the maximum yield of NADH. Now the optimal objective value is the flux through the NADH consuming reaction in units of mmol gDW-1 h-1. Because the uptake rate of glucose has been set to exactly -1 mmol gDW-1 h-1, we can obtain easily the ratio of NADH molecules produced per glucose molecules consumed. The result should be 10.29 mol NADH/mol glucose.
solution = model.optimize()print('Maximum NADH production rate:', solution.f, 'mol NADH/mol glucose')Then, print the shadow price of cytosolic proton (`h_c`) and ATP (`atp_c`).Then, print the shadow price of cytosolic proton (h_c) and ATP (atp_c).
model.metabolites.h_c.ymodel.metabolites.atp_c.yNADH and NADPH production are also ultimately limited by proton balancing. Formaximum NADH yield, the proton shadow price is -0.07 and ATP shadow price 15.4. The protons produced in metabolism are removed by `ATPS4rpp` in reverse (with a negative flux), which consumes ATP and exports protons to the periplasm. The stoichiometry of the network also limits the production of NADH and NADPH. Glucose has 12 electron pairs that can be donated to redox carriers such as NAD+ or NADP+ , but when the TCA cycle is used, two of these electron pairs are usedto reduce the quinone `q8_c` in the succinate dehydrogenase reaction (`SUCDi`), and thus cannot be used to produce NADH or NADPH. The only pathway that can reduce 12 redox carriers with one molecule glucose is the pentose phosphate pathway, but this is infeasible because this pathway generates no ATP, which is needed to pump out the protons that are produced.NADH
and NADPH production are also ultimately limited by proton balancing.
For
maximum NADH yield, the proton shadow price is -0.07 and ATP shadow
price 15.4. The protons produced in metabolism are removed by ATPS4rpp
in reverse (with a negative flux), which consumes ATP and exports
protons to the periplasm. The stoichiometry of the network also limits
the production of NADH and NADPH. Glucose has 12 electron pairs that can
be donated to redox carriers such as NAD+ or NADP+ , but when the TCA
cycle is used, two of these electron pairs are used
to reduce the quinone q8_c in the succinate dehydrogenase reaction (SUCDi),
and thus cannot be used to produce NADH or NADPH. The only pathway that
can reduce 12 redox carriers with one molecule glucose is the pentose
phosphate pathway, but this is infeasible because this pathway generates
no ATP, which is needed to pump out the protons that are produced.
Repeat the same analysis and compute the maximal yield for NADPH. The result should be 9.36 mol NADH/mol glucose.Repeat the same analysis and compute the maximal yield for NADPH. The result should be 9.36 mol NADH/mol glucose.
# Create the reaction objectreaction = cobra.Reaction('NADPH_consume')reaction.name = 'NADPH consuming reaction'# Add the reaction object to the modelmodel.add_reaction(reaction)# Define the reaction equation and metabolites using a string. Example A + B -> Creaction.build_reaction_from_string('nadph_c -> nadp_c + h_c', verbose=True)print(model.reactions.NADPH_consume.reaction)# Change objective function to the NADPH_consume reactionmodel.change_objective('NADPH_consume')# Compute optimal solutionsolution = model.optimize()print('Maximum NADPH production rate:', solution.f, 'mol NADPH/mol glucose')# Reload the metabolic model with default valuesmodel = cobra.io.load_json_model('iJO1366.json')model.change_objective('BIOMASS_Ec_iJO1366_core_53p95M')# Define aliases for the glucose and oxygen uptake reactionsglc_exchange = model.reactions.get_by_id('EX_glc__D_e')o2_exchange = model.reactions.get_by_id('EX_o2_e')Compute the growth in the WT strain with the `model.optimize` function. The default conditions are aerobic growth on glucose with uptake rate of 10 mmol gDW<sup>-1</sup> h<sup>-1</sup>.Compute the growth in the WT strain with the model.optimize function. The default conditions are aerobic growth on glucose with uptake rate of 10 mmol gDW-1 h-1.
solution = model.optimize()print('WT growth rate: {:.3f} h^-1'.format(solution.f))#### Gene-protein-reaction rulesJust as growth in different environments can be simulated with FBA, gene knockouts can also be simulated by changing reaction bounds. To simulate the knockout of any gene, its associated reaction or reactions can simply be constrained to not carry flux. By setting both the upper and lower bounds of a reaction to 0 mmol gDW<sup>-1</sup> h<sup>-1</sup> , a reaction is essentially knocked out, and is restricted from carrying flux. The E. coli core model, like many other constraint-based models, contains a list of gene-protein-reaction interactions (GPRs), a list of Boolean rules that dictate which genes are connected with each reaction in the model. When a reaction is catalyzed by isozymes (two different enzymes that catalyze the same reaction), the associated GPR contains an “or” rule, where either of two or more genes may be knocked out but the reaction will not be constrained. For example, the GPR for phosphofructokinase (`PFK`) is “b1723 (pfkB) or b3916 (pfkA),” so according to this Boolean rule, both pfkB and pfkA must be knocked out to restrict this reaction. When a reaction is catalyzed by a protein with multiple essential subunits, the GPR contains an “and” rule, and if any of the genes are knocked out the reaction will be constrained to 0 flux. Succinyl-CoA synthetase (SUCOAS), for example, has the GPR “b0728 (sucC) and b0729 (sucD),” so knocking out either of these genes will restrict this reaction. Some reactions are catalyzed by a single gene product, while others may be associated with ten or more genes in complex associations.The gene-reaction rules can be accessed with the `gene_reaction_rule` attribute of the reaction. Example:Just as growth in different environments can be simulated with FBA,
gene knockouts can also be simulated by changing reaction bounds. To
simulate the knockout of any gene, its associated reaction or reactions
can simply be constrained to not carry flux. By setting both the upper
and lower bounds of a reaction to 0 mmol gDW-1 h-1
, a reaction is essentially knocked out, and is restricted from
carrying flux. The E. coli core model, like many other constraint-based
models, contains a list of gene-protein-reaction interactions (GPRs), a
list of Boolean rules that dictate which genes are connected with each
reaction in the model. When a reaction is catalyzed by isozymes (two
different enzymes that catalyze the same reaction), the associated GPR
contains an “or” rule, where either of two or more genes may be knocked
out but the reaction will not be constrained. For example, the GPR for
phosphofructokinase (PFK) is “b1723 (pfkB) or b3916
(pfkA),” so according to this Boolean rule, both pfkB and pfkA must be
knocked out to restrict this reaction. When a reaction is catalyzed by a
protein with multiple essential subunits, the GPR contains an “and”
rule, and if any of the genes are knocked out the reaction will be
constrained to 0 flux. Succinyl-CoA synthetase (SUCOAS), for example,
has the GPR “b0728 (sucC) and b0729 (sucD),” so knocking out either of
these genes will restrict this reaction. Some reactions are catalyzed by
a single gene product, while others may be associated with ten or more
genes in complex associations.
The gene-reaction rules can be accessed with the gene_reaction_rule attribute of the reaction. Example:
print(model.reactions.PFK.gene_reaction_rule)print(model.reactions.SUCOAS.gene_reaction_rule)#### Single gene deletionThe COBRA package contains a function called `single_deletion` that uses the GPRs to constrain the correct reactions and computes the growth rate by FBA optimization. For example, growth can be predicted for E. coli growing aerobically on glucose with the gene _zwf_ (id `b1852`), corresponding to the reaction glucose-6-phosphate dehydrogenase (G6PDH2r), knocked out.More information:+ COBRApy documentation http://cobrapy.readthedocs.org/en/latest/deletions.htmlThe COBRA package contains a function called single_deletion
that uses the GPRs to constrain the correct reactions and computes the
growth rate by FBA optimization. For example, growth can be predicted
for E. coli growing aerobically on glucose with the gene zwf (id b1852), corresponding to the reaction glucose-6-phosphate dehydrogenase (G6PDH2r), knocked out.
More information:
First, print the reaction list associated with the _zwf_ gene (id `b1852`). Gene object can be accessed through the `model.genes` dictionary. The `reactions` attribute contains all the reactions that are associated with the gene as a python frozen set. Use a for loop to print all the reactions and their name. In the case of gene `b1852`, there is only one associated reaction.First, print the reaction list associated with the zwf gene (id b1852). Gene object can be accessed through the model.genes dictionary. The reactions
attribute contains all the reactions that are associated with the gene
as a python frozen set. Use a for loop to print all the reactions and
their name. In the case of gene b1852, there is only one associated reaction.
for reac in model.genes.b1852.reactions: print(reac, ' ', reac.name)We have also defined above a function to print out the most relevant information about a gene. It takes as argument the gene object. Try it with the same gene `b1852`.We
have also defined above a function to print out the most relevant
information about a gene. It takes as argument the gene object. Try it
with the same gene b1852.
print_gene_info(model.genes.b1852)Compute the growth rate for the strain with gene _zwf_ gene (id `b1852`) knocked out. The `single_deletion` function is part of the subpackage `cobra.flux_analysis`. It takes as first argument the model object and as second argument a set of gene objects (hint: use `{}` to pass a set of objects).Compute the growth rate for the strain with gene zwf gene (id b1852) knocked out. The single_deletion function is part of the subpackage cobra.flux_analysis. It takes as first argument the model object and as second argument a set of gene objects (hint: use {} to pass a set of objects).
growth_rate, status = cobra.flux_analysis.single_gene_deletion(model, {model.genes.b1852} )print('zwf- growth rate: {:.3f}, status {}'.format(growth_rate['b1852'], status['b1852']))The FBA predicted growth rate of the mutant strain is 0.976 h<sup>-1</sup>, which is slightly lower than the growth rate of 0.982 h<sup>-1</sup> for wild-type E. coli because the cell can no longer use the oxidative branch of the pentose phosphate pathway to generate NADPH.The FBA predicted growth rate of the mutant strain is 0.976 h-1, which is slightly lower than the growth rate of 0.982 h-1 for wild-type E. coli because the cell can no longer use the oxidative branch of the pentose phosphate pathway to generate NADPH.
Using FBA to predict the phenotypes of gene knockouts is especially useful in predicting essential genes. When no alternative pathway or isoenzyme exist to substitute the deleted reactions associated to the knocked out gene, the reported growth rate is zero and the gene is reported as essential in the analysis.Print the reactions associated with gene _icd_ (`b1136`).Using FBA to predict the phenotypes of gene knockouts is especially useful in predicting essential genes. When no alternative pathway or isoenzyme exist to substitute the deleted reactions associated to the knocked out gene, the reported growth rate is zero and the gene is reported as essential in the analysis.
Print the reactions associated with gene icd (b1136).
for reac in model.genes.b1136.reactions: print(reac, ' ', reac.name, ' ', reac.reaction)Compute the single gene deletion growth rate of gene _icd_ (`b1136`).Compute the single gene deletion growth rate of gene icd (b1136).
growth_rate, status = cobra.flux_analysis.single_gene_deletion(model, {model.genes.b1136} )print('icd- growth rate: {:.3f}, status {}'.format(growth_rate['b1136'], status['b1136']))The gene _icd_ encodes for the enzyme that catalyzes the oxidative decarboxylation of isocitrate, producing $\alpha$-ketoglutarate ($\alpha$-KG) and CO2. The reaction is part of the TCA cycle. Although the TCA cycle allows maximal energy yield from degradation of glucose, its more critical role is to supply precursors for biosynthetic pathways that branch from $\alpha$-ketoglutarate, oxaloacetate, and succinyl CoA. There is only one way to make $\alpha$-KG, so the segment of the cycle leading to $\alpha$-KG is essential.Draw the metabolic map with the knocked-out reaction highlighted in red.The gene icd encodes for the enzyme that catalyzes the oxidative decarboxylation of isocitrate, producing -ketoglutarate (-KG) and CO2. The reaction is part of the TCA cycle. Although the TCA cycle allows maximal energy yield from degradation of glucose, its more critical role is to supply precursors for biosynthetic pathways that branch from -ketoglutarate, oxaloacetate, and succinyl CoA. There is only one way to make -KG, so the segment of the cycle leading to -KG is essential.
Draw the metabolic map with the knocked-out reaction highlighted in red.
model_modified = model.copy()model_modified.reactions.ICDHyr.delete()metabolicMap = escher.Builder(map_name='iJO1366.Central metabolism', model=model_modified, # in the map, highlight all reactions that are missing from the model highlight_missing=True, # only show the primary metabolites hide_secondary_metabolites=True)metabolicMap.display_in_notebook()More information:+ Kim, J., & Copley, S. D. (2007). Why metabolic enzymes are essential or nonessential for growth of Escherichia coli K12 on glucose. Biochemistry, 46(44), 12501–12511. http://doi.org/10.1021/bi7014629<img src="Images/kim2007_metabolic_map_essential_genes.png" width="720" />Legend: _Overview of the central metabolic pathways of E. coli. Pathways are indicated by shading as follows: green for the Embden- Myerhof-Parnase pathway, yellow for the pentose phosphate pathway, pink for the Entner-Doudoroff pathway, and blue for the TCA cycle. Steps catalyzed by enzymes that that are essential for growth on glucose are denoted with red arrows, and the genes are colored red. Genes encoding subunits of multisubunit enzymes are enclosed in brackets. Where deletion of genes encoding two isozymes results in loss of the ability to grow on glucose, the genes are colored blue._More information:

Legend: Overview of the central metabolic pathways of E. coli. Pathways are indicated by shading as follows: green for the Embden- Myerhof-Parnase pathway, yellow for the pentose phosphate pathway, pink for the Entner-Doudoroff pathway, and blue for the TCA cycle. Steps catalyzed by enzymes that that are essential for growth on glucose are denoted with red arrows, and the genes are colored red. Genes encoding subunits of multisubunit enzymes are enclosed in brackets. Where deletion of genes encoding two isozymes results in loss of the ability to grow on glucose, the genes are colored blue.
Because FBA can compute phenotypes very quickly, it is feasible to use it for large scale computational screens for gene essentiality, including screens for two or more simultaneous knockouts.Compute the growth rate of all single gene deletions by passing the list of all genes to the `single_deletion` function, part of the `cobra.flux_analysis` module.Because FBA can compute phenotypes very quickly, it is feasible to use it for large scale computational screens for gene essentiality, including screens for two or more simultaneous knockouts.
Compute the growth rate of all single gene deletions by passing the list of all genes to the single_deletion function, part of the cobra.flux_analysis module.
growth_rates, statuses = cobra.flux_analysis.single_gene_deletion(model, model.genes[:])list(growth_rates.items())[:30]Then, count the number of single gene knockouts that are essential and non-essential. The relatively small fraction of genes that are essential shows the built-in robustness of E. coli metabolism to single-gene deletions.Then, count the number of single gene knockouts that are essential and non-essential. The relatively small fraction of genes that are essential shows the built-in robustness of E. coli metabolism to single-gene deletions.
# We convert the lists to a pandas dataframe for easier data manipulationsingle_gene_deletion_df = pandas.DataFrame.from_dict({"growth_rates": growth_rates, "status": statuses})growth_rate_threshold = 1e-5print('Nb of essential genes: ', len(single_gene_deletion_df[single_gene_deletion_df['growth_rates'] <= growth_rate_threshold]))print('Nb of non-essential genes: ',len(single_gene_deletion_df[single_gene_deletion_df['growth_rates'] > growth_rate_threshold]))#### Double gene deletions (synthetic lethals)FBA analysis can also be used to compute the effects of knocking down pairs of genes. Synthetic lethals (SL) refer to pairs of non‐essential genes whose simultaneous deletion is lethal. Synthetic gene lethality can arise for a variety of reasons. For example, two gene protein products can be interchangeable with respect to an essential function (isozymes), act in the same essential pathway (with each mutation decreasing the flux through that pathway), or operate in two separate pathways with redundant or complementary essential functions. The study of synthetic lethality plays a pivotal role in elucidating functional associations between genes and gene function predictions.The identified SL pairs are phenotypically classified into two types. The first type includes the ones that yield auxotrophic strains that can be rescued through the supply of missing nutrients (i.e. amino acids or other compounds), whereas the second type includes those that lack essential functionalities that cannot be restored by adding extra components to the growth medium.More information:+ COBRApy documentation http://cobrapy.readthedocs.org/en/latest/deletions.html#double-deletions+ Suthers, P. F., Zomorrodi, A., & Maranas, C. D. (2009). Genome-scale gene/reaction essentiality and synthetic lethality analysis. Molecular Systems Biology, 5(301), 1–17. http://doi.org/10.1038/msb.2009.56+ Güell, O., Sagués, F., & Serrano, M. Á. (2014). Essential plasticity and redundancy of metabolism unveiled by synthetic lethality analysis. PLoS Computational Biology, 10(5), e1003637. http://doi.org/10.1371/journal.pcbi.1003637FBA analysis can also be used to compute the effects of knocking down pairs of genes. Synthetic lethals (SL) refer to pairs of non‐essential genes whose simultaneous deletion is lethal. Synthetic gene lethality can arise for a variety of reasons. For example, two gene protein products can be interchangeable with respect to an essential function (isozymes), act in the same essential pathway (with each mutation decreasing the flux through that pathway), or operate in two separate pathways with redundant or complementary essential functions. The study of synthetic lethality plays a pivotal role in elucidating functional associations between genes and gene function predictions.
The identified SL pairs are phenotypically classified into two types. The first type includes the ones that yield auxotrophic strains that can be rescued through the supply of missing nutrients (i.e. amino acids or other compounds), whereas the second type includes those that lack essential functionalities that cannot be restored by adding extra components to the growth medium.
More information:
As an example of the first category of synthetic lethal pair, compute the growth rates of an E. coli strain lacking both cysteine synthase genes _cysK_ (`b2414`), _cysM_ (`b2421`). Double gene deletions can be computed with the `double_deletion` function of the `cobra.flux_analysis` module. The functions takes as argument the cobra model, the list of genes to delete, and the optional argument `return_frame`. The latter optional, when set to `True`, return a pandas frame.Compute the growth rates for the double deletion.As
an example of the first category of synthetic lethal pair, compute the
growth rates of an E. coli strain lacking both cysteine synthase genes cysK (b2414), cysM (b2421). Double gene deletions can be computed with the double_deletion function of the cobra.flux_analysis module. The functions takes as argument the cobra model, the list of genes to delete, and the optional argument return_frame. The latter optional, when set to True, return a pandas frame.
Compute the growth rates for the double deletion.
double_deletion_results = cobra.flux_analysis.double_gene_deletion(model, {model.genes.b2414, model.genes.b2421}, return_frame=True)print('cysK- cysM- double knockout growth rates:')double_deletion_resultsThe pair of genes form a synthetic lethal pair, since the growth rate is almost normal when deleting only one of the two genes, while the simultaneous deletion of both genes leads to growth arrest.The growth can be rescued in silico through the supplementation of the growth medium by L-cysteine (`cys__L_e`). Add this metabolite to the medium by changing the lower bound of its exchange reaction to a small negative amount (reminder: uptake rate are negative) and repeat the double deletion analysis. Observe what happens when changing the uptake rate from -0.05 to -0.5.The pair of genes form a synthetic lethal pair, since the growth rate is almost normal when deleting only one of the two genes, while the simultaneous deletion of both genes leads to growth arrest.
The growth can be rescued in silico through the supplementation of the growth medium by L-cysteine (cys__L_e).
Add this metabolite to the medium by changing the lower bound of its
exchange reaction to a small negative amount (reminder: uptake rate are
negative) and repeat the double deletion analysis. Observe what happens
when changing the uptake rate from -0.05 to -0.5.
# Supplement the medium with L-cysteinemodel.reactions.EX_cys__L_e.lower_bound = -0.2# Compute double deletion growth ratesdouble_deletion_results = cobra.flux_analysis.double_gene_deletion(model, {model.genes.b2414, model.genes.b2421}, return_frame=True)print('cysK- cysM- double knockout growth rates with cysteine-supplemented medium:')# Set back to the default valuemodel.reactions.EX_cys__L_e.lower_bound = 0double_deletion_resultsAs a example of the second category of synthetic lethal pair, disruption of modA (`b0763`) and cysA (`b2422`) results in a strain that cannot be rescued through the addition of the missing compound molybdate (`modb_e`) as the gene disruptions eliminate MOB-Dabcpp (molybdate periplasm transport through ABC system) and thus molybdate cannot be imported into the cell.Print the reaction names associated with the two genes.As a example of the second category of synthetic lethal pair, disruption of modA (b0763) and cysA (b2422) results in a strain that cannot be rescued through the addition of the missing compound molybdate (modb_e)
as the gene disruptions eliminate MOB-Dabcpp (molybdate periplasm
transport through ABC system) and thus molybdate cannot be imported into
the cell.
Print the reaction names associated with the two genes.
print_gene_info(model.genes.b0763)print_gene_info(model.genes.b2422)Compute the growth rates for the double deletion.Compute the growth rates for the double deletion.
double_deletion_results = cobra.flux_analysis.double_gene_deletion(model, {model.genes.b0763, model.genes.b2422}, return_frame=True)print('modA- cysA- double knockout growth rates:')double_deletion_resultsCompute the growth rates for the double deletion with mobdylate-supplemented (`mobd_e`) medium. The supplementation of mobdylate in the medium cannot rescue growth in the double-deletion mutant.Compute the growth rates for the double deletion with mobdylate-supplemented (mobd_e) medium. The supplementation of mobdylate in the medium cannot rescue growth in the double-deletion mutant.
# Supplement the medium with mobdylatemodel.reactions.EX_mobd_e.lower_bound = -1000# Compute double deletion growth ratesdouble_deletion_results = cobra.flux_analysis.double_gene_deletion(model, {model.genes.b0763, model.genes.b2422}, return_frame=True)print('modA- cysA- double knockout growth rates with mobdylate-supplemented medium:')# Set back to the default valuemodel.reactions.EX_mobd_e.lower_bound = -1000double_deletion_results#### Double gene deletion screenDue to the relatively fast computation speed of FBA, double gene deletion analysis can be performed in a high-throughput manner. As an example, we will perform a double gene knockout screen for all genes involved in the central carbohydrates metabolism.Because gene annotations are not available in the iJO1366 cobra model, we will retrieve the gene ontology annotations from the EcoCyc database. From the EcoCyc [web interface](http://ecocyc.org), browse the gene ontology and open the class carbohydrate metabolic process (393 instances).<img src="Images/Ecocyc_1.png" width="500" />Save the gene class as a SmartTable (user account log in is required). In order to list all the genes that are tagged with this gene ontology class, open the new SmartTable and add a transform column "objects annotated to GO terms". Create a new SmartTable from the objects list column (see red arrow).<img src="Images/Ecocyc_2b.png" width="500" />Add a new property column with the gene name corresponding to each protein.<img src="Images/Ecocyc_3b.png" width="500" />From the menu of the right, export the SmartTable to a spreadsheet. Choose common names for the format type, as otherwise the gene names are formatted as EcoCyc objectID by default. Move the file to the same dirctory as this notebook and rename it to "EcoCyc_carbohydrate_metabolic_process_GO.xls".More Information:+ EcoCyc SmartTable documentation http://biocyc.org/PToolsWebsiteHowto.shtml#node_sec_7Due to the relatively fast computation speed of FBA, double gene deletion analysis can be performed in a high-throughput manner. As an example, we will perform a double gene knockout screen for all genes involved in the central carbohydrates metabolism.
Because gene annotations are not available in the iJO1366 cobra
model, we will retrieve the gene ontology annotations from the EcoCyc
database. From the EcoCyc web interface, browse the gene ontology and open the class carbohydrate metabolic process (393 instances).
Save the gene class as a SmartTable (user account log in is required).
In order to list all the genes that are tagged with this gene ontology
class, open the new SmartTable and add a transform column "objects
annotated to GO terms". Create a new SmartTable from the objects list
column (see red arrow).
Add a new property column with the gene name corresponding to each protein.
From the menu of the right, export the SmartTable to a spreadsheet.
Choose common names for the format type, as otherwise the gene names are
formatted as EcoCyc objectID by default. Move the file to the same
dirctory as this notebook and rename it to
"EcoCyc_carbohydrate_metabolic_process_GO.xls".
More Information:
Data tables can be easily imported and handled with the [pandas package](http://pandas.pydata.org/). Pandas brings to Python data structures (dataframe) and data analysis tools similar to R.Import the table with the `pandas.read_table` function.Data tables can be easily imported and handled with the pandas package. Pandas brings to Python data structures (dataframe) and data analysis tools similar to R.
Import the table with the pandas.read_table function.
list_genes_GO = pandas.read_table("EcoCyc_carbohydrate_metabolic_process_GO.xls")# Rename first columnlist_genes_GO.rename(columns={list_genes_GO.columns[0]: 'reaction_name'}, inplace=True)# Drop values with missing gene name (2 cases)list_genes_GO = list_genes_GO.dropna()list_genes_GOFrom this list, find the matching genes in the cobra model. Iterate through the genes in the list, and for each gene, search for the gene in the cobra model with the same exact name (use `re.search` function with the case-insensitive option `re.I`). We will use the `apply` function to iterate over the rows of the dataframe.More information:+ [stackoverflow post](http://stackoverflow.com/questions/15118111/apply-function-to-each-row-of-pandas-dataframe-to-create-two-new-columns): Apply function to each row of pandas dataframe to create two new columns+ Pandas documenation: [Merge, join, and concatenate](http://pandas.pydata.org/pandas-docs/stable/merging.html#ignoring-indexes-on-the-concatenation-axis)From
this list, find the matching genes in the cobra model. Iterate through
the genes in the list, and for each gene, search for the gene in the
cobra model with the same exact name (use re.search function with the case-insensitive option re.I). We will use the apply function to iterate over the rows of the dataframe.
More information:
# Define the function to apply at each row of the dataframedef find_cobra_matching_gene(row): cobra_gene = None for gene in model.genes: if re.search(gene.name, row.Gene, re.I): cobra_gene = gene found = False if cobra_gene is None else True # We must return a pandas.Series in order to be able to add/merge the new columns to the original dataframe return pandas.Series({'cobra_gene_found':found, 'cobra_gene_object':cobra_gene})# Apply the function to the rows (axis=1) of the dataframe and add the result as new columnslist_genes_GO2 = list_genes_GO.merge(list_genes_GO.apply(find_cobra_matching_gene, axis=1), left_index=True, right_index=True)list_genes_GO2Print out how many matching genes were found.Print out how many matching genes were found.
print("Nb of genes in the carbohydrates metabolism category: ", len(list_genes_GO2))print("Nb of matching genes in the cobra model: ", len(list_genes_GO2[list_genes_GO2.cobra_gene_found]))print("Nb of genes absent in the cobra model: ", len(list_genes_GO2[~list_genes_GO2.cobra_gene_found]))The number of matching genes in the cobra model is lower than the number of genes from the EcoCyc database because many genes are not included in the metabolic model. Many of these genes may have unknown function.Compute the double gene deletion growth rate for the 100 first genes in the list of cobra genes associated to the carbohydrates metabolism GO category.The number of matching genes in the cobra model is lower than the number of genes from the EcoCyc database because many genes are not included in the metabolic model. Many of these genes may have unknown function.
Compute the double gene deletion growth rate for the 100 first genes in the list of cobra genes associated to the carbohydrates metabolism GO category.
gene_list = list(list_genes_GO2[list_genes_GO2.cobra_gene_found].cobra_gene_object)[:90]double_deletion_results = cobra.flux_analysis.double_gene_deletion(model, gene_list, return_frame=True)double_deletion_resultsPlot the resulting growth rate for each of the gene pair double deletion as a 2D heat map.Some genes are always essential, and result in a growth rate of 0 when knocked out no matter which other gene is also knocked out (horizontal and vertical blue lines). Other genes form synthetic lethal pairs, where knocking out only one of the genes has no effect on growth rate, but knocking both out is lethal (blue squares).Plot the resulting growth rate for each of the gene pair double deletion as a 2D heat map.
Some genes are always essential, and result in a growth rate of 0 when knocked out no matter which other gene is also knocked out (horizontal and vertical blue lines). Other genes form synthetic lethal pairs, where knocking out only one of the genes has no effect on growth rate, but knocking both out is lethal (blue squares).
color_palette = seaborn.cubehelix_palette(8, start=.5, rot=-0.8, dark=0.3, light=0.9, reverse=True, as_cmap=True)f, ax = plt.subplots(figsize=(16,14))ax = seaborn.heatmap(double_deletion_results, cmap=color_palette, square=True, linewidths=0, cbar_kws={"label":"growth rate h^-1"})ax.tick_params(axis='both', which='major', labelsize=9)From the graph, identify one new synthetic lethal pair, print out the genes information, and compute the growth rates for the corresponding double deletion.From the graph, identify one new synthetic lethal pair, print out the genes information, and compute the growth rates for the corresponding double deletion.
print_gene_info(model.genes.b1276)print_gene_info(model.genes.b0118)double_deletion_results = cobra.flux_analysis.double_gene_deletion(model, {model.genes.b1276, model.genes.b0118}, return_frame=True)print('acnA- acnB- double knockout growth rates:')double_deletion_resultsThis synthetic lethal pair was already indentified in a previous version of the E.coli metabolic model, iAF1260 (Suthers et al. 2009) and described experimentally in (Gruer et al. 1997). _acnA_ and _acnB_ genes encode for two aconitases that catalyze the reversible isomerization of citrate and iso-citrate via cis-aconitate. Deletion of both genes leads to a auxotrophic strain that can be rescued through the addition of glutamate in the medium.<img src="Images/kim2007_metabolic_map_essential_genes.png" width="720" />More information:+ Suthers, P. F., Zomorrodi, A., & Maranas, C. D. (2009). Genome-scale gene/reaction essentiality and synthetic lethality analysis. Molecular Systems Biology, 5(301), 1–17. http://doi.org/10.1038/msb.2009.56+ Gruer, M. J., Bradbury, A. J., & Guest, J. R. (1997). Construction and properties of aconitase mutants of Escherichia coli. Microbiology, 143(6), 1837–1846. http://doi.org/10.1099/00221287-143-6-1837+ Kim, J., & Copley, S. D. (2007). Why metabolic enzymes are essential or nonessential for growth of Escherichia coli K12 on glucose. Biochemistry, 46(44), 12501–12511. http://doi.org/10.1021/bi7014629This synthetic lethal pair was already indentified in a previous version of the E.coli metabolic model, iAF1260 (Suthers et al. 2009) and described experimentally in (Gruer et al. 1997). acnA and acnB genes encode for two aconitases that catalyze the reversible isomerization of citrate and iso-citrate via cis-aconitate. Deletion of both genes leads to a auxotrophic strain that can be rescued through the addition of glutamate in the medium.

More information:
Compute the double deletion growth rates when supplementing the medium with glutamate.Compute the double deletion growth rates when supplementing the medium with glutamate.
# Supplement the medium with glutamatemodel.reactions.EX_glu__L_e.lower_bound = -1# Compute double deletion growth ratesdouble_deletion_results = cobra.flux_analysis.double_gene_deletion(model, {model.genes.b1276, model.genes.b0118}, return_frame=True)print('acnA- acnB- double knockout growth rates in glucose minimal medium supplemented with glutamate:')# Set back to the default valuemodel.reactions.EX_glu__L_e.lower_bound = 0double_deletion_results